Corn, wheat and soybeans are the three major crops that are produced in the United States. In particularly, Midwest has the stable climate to grow the crops. Considering the limited farmland and seasonality, farmers have to think how to fully utilize this land to get the most out of it. Below is the summary of the planting and harvest for each crops. According to www.thebalance.com,
Corn – Plant: Apr - June – Harvest: Oct - Nov
Wheat – Plant: Apr - May – Harvest: Aug - Sept
Soybeans – Plant: Apr - June – Harvest: Sept - Nov
What is the nature of crops market in general and in particular for this data set?
How would the performance of these commodities affect the timing of farming?
How would we manage the allocation of existing resources given we have just landed in this new market?
Volatility is rarely constant and often has a structure (mean reversion) and is dependent on the past.
Extreme events are likely to happen with other extreme events.
Negative returns are more likely than positive returns (left skew).
Corn and wheat have a moderate correlation of 0.63.
Corn and soybeans have a slight weaker correlated than corn and wheat. The correlation value is 0.58.
Wheat and soybeans have fair correlation of 0.36.
There is a positive spillover effect from wheat into corn. In other words, the volatility of the corn market and wheat market are closely related.
We can sensitize our analysis with the range of upper and lower bounds on the parameter estimates of the relationship between correlation and volatility.
The log()-log() transformation allows us to interpret the regression coefficients as elasticities, which vary with the quantile. The larger the elasticity, especially if the absolute value is greater than one, the more risk dependence one market has on the other.
Historical data 2013-2018
-Adapted source from http://www.futuresknowledge.com
| mean | median | std_dev | IQR | skewness | kurtosis | |
|---|---|---|---|---|---|---|
| corn | -0.0226 | 0.0000 | 1.3365 | 1.5643 | -0.0809 | 5.1907 |
| wheat | -0.0182 | -0.1039 | 1.6862 | 2.1406 | 0.2823 | 3.6911 |
| soybeans | -0.0380 | 0.0000 | 1.2382 | 1.4473 | -0.2258 | 5.2570 |
pdf
2
pdf
2
The weights for the Markowitz tangency portfolio (“*“) are
corn wheat soybeans
0.3143 0.1700 0.5157
The weights for the QR tangency portfolio are
[1] -6.62 2.59 5.03
For the working capital accounts:
We need to know how much cash are needed to support the risk tolerance and threshold. To accomplish that, we set that the organization’s tolerance for risk at 5%: that is, a maximum of 5% of the time could losses exceed 10%. The organization also sets the collateral at 2% and a 25% standard deviation. Then we need to solve w in the equation:
\[ z = \frac{- 0.1 - 0.1w - 0.02 ( 1 - w )}{0.25w} \]
or
\[ NormalInverse [ Normal ( \frac{- 0.12 - 0.12w}{0.25w} )] = NormalInverse(0.05) \]
We calculate NormalInverse(0.05) by performingqnorm(0.05)[1] -1.645
then, to solve w:
risky <- -0.12/(0.25*(-1.64)+0.12)
paste('The weight invested in the risky contract', risky)[1] "The weight invested in the risky contract 0.413793103448276"
Therefore, the risky contract value = 42% of the portfolio value.
The upper red line and the lower red line forms a boundary to tell us what is happening to our exceedences. The middle, black line is the shape of parameter at various threshold and their corresponding exceedences. ### GPD fits and starts
All crops have experienced spikes in price and magnitude percentage change. Common shared factors are climate and competitors. Corn vs wheat and corn vs soybeans have moderate positive correlation (corn vs wheat is 0.63; corn vs soybeans is 0.58.
The value at risk of the crops mix would be matched to the market performance ergo value of the commodities. The timing of the farming directly correlates to the proportionate projected volatility of the crops mix. It shows that certain mixes of crops and will provide us with less volatility and more value for our risk associated. Corn, wheat and soybeans all have a very low mean. They also have smaller standard deviations which proves to be more consistent and at less risk when developing what will be necessary to achieve optimal size and mix.
The working capital account of $100 million should be allocated as follows: buy $31 million in corns, $17 million in wheat, $52 million in soybeans.
R Packages used
ggplot, scales, quadprog, quantreg, shiny, flexdashboard, qrmdata, xts, matrixStats, zoo, QRM, plotly, and psych
References - Github and Stack Overflow for trouble shooting
Brian Peirce for the help in brainstorming business questions
www.thebalance.com
www.macrotrend.net
https://turing.manhattan.edu/~wfoote01/finalytics/_site/index.html
---
title: "Final Project"
author: Teng Siong "T.S" Yeap
output:
flexdashboard::flex_dashboard:
orientation: columns
source_code: embed
theme: "bootstrap"
---
```{r, include=FALSE}
knitr::opts_chunk$set(echo = FALSE, warning = FALSE, message = FALSE)
rm(list = ls())
library(ggplot2)
library(flexdashboard)
library(shiny)
library(QRM)
library(qrmdata)
library(xts)
library(zoo)
library(plotly)
#library(ggfortify)
library(psych)
library(matrixStats)
library(moments)
library(quantreg)
library(quadprog)
library(scales)
# PAGE: Exploratory Analysis
data <- na.omit(read.csv("cropsdata.csv", header = TRUE))
prices <- data
# Compute log differences percent using as.matrix to force numeric type
data.r <- diff(log(as.matrix(data[, -1]))) * 100
# Create size and direction
size <- na.omit(abs(data.r)) # size is indicator of volatility
#head(size)
colnames(size) <- paste(colnames(size),".size", sep = "") # Teetor
direction <- ifelse(data.r > 0, 1, ifelse(data.r < 0, -1, 0)) # another indicator of volatility
colnames(direction) <- paste(colnames(direction),".dir", sep = "")
# Convert into a time series object:
# 1. Split into date and rates
dates <- as.Date(data$DATE[-1], "%m/%d/%Y")
dates.chr <- as.character(data$DATE[-1])
#str(dates.chr)
values <- cbind(data.r, size, direction)
# for dplyr pivoting and ggplot2 need a data frame also known as "tidy data"
data.df <- data.frame(dates = dates, returns = data.r, size = size, direction = direction)
data.df.nd <- data.frame(dates = dates.chr, returns = data.r, size = size, direction = direction, stringsAsFactors = FALSE)
#non-coerced dates for subsetting on non-date columns
# 2. Make an xts object with row names equal to the dates
data.xts <- na.omit(as.xts(values, dates)) #order.by=as.Date(dates, "%d/%m/%Y")))
#str(data.xts)
data.zr <- as.zooreg(data.xts)
returns <- data.xts # watch for this data below!
# PAGE: Market risk
corr_rolling <- function(x) {
dim <- ncol(x)
corr_r <- cor(x)[lower.tri(diag(dim), diag = FALSE)]
return(corr_r)
}
vol_rolling <- function(x){
library(matrixStats)
vol_r <- colSds(x)
return(vol_r)
}
ALL.r <- data.xts[, 1:3]
window <- 90 #reactive({input$window})
corr_r <- rollapply(ALL.r, width = window, corr_rolling, align = "right", by.column = FALSE)
colnames(corr_r) <- c("corn.wheat", "corn.soybeans", "wheat.soybeans")
vol_r <- rollapply(ALL.r, width = window, vol_rolling, align = "right", by.column = FALSE)
colnames(vol_r) <- c("corn.vols", "wheat.vols", "soybeans.vols")
year <- format(index(corr_r), "%Y")
r_corr_vol <- merge(ALL.r, corr_r, vol_r, year)
# Market dependencies
#library(matrixStats)
R.corr <- apply.monthly(as.xts(ALL.r), FUN = cor)
R.vols <- apply.monthly(ALL.r, FUN = colSds) # from MatrixStats
# Form correlation matrix for one month
R.corr.1 <- matrix(R.corr[20,], nrow = 3, ncol = 3, byrow = FALSE)
rownames(R.corr.1) <- colnames(ALL.r[,1:3])
colnames(R.corr.1) <- rownames(R.corr.1)
R.corr.1
R.corr <- R.corr[, c(2, 3, 6)]
colnames(R.corr) <- c("corn.wheat", "corn.soybeans", "wheat.soybeans")
colnames(R.vols) <- c("corn.vols", "wheat.vols", "soybeans.vols")
R.corr.vols <- na.omit(merge(R.corr, R.vols))
year <- format(index(R.corr.vols), "%Y")
R.corr.vols.y <- data.frame(corn.correlation = R.corr.vols[,1], wheat.volatility = R.corr.vols[,5], year = year)
corn.vols <- as.numeric(R.corr.vols[,"corn.vols"])
wheat.vols <- as.numeric(R.corr.vols[,"wheat.vols"])
soybeans.vols <- as.numeric(R.corr.vols[,"soybeans.vols"])
library(quantreg)
taus <- seq(.05,.95,.05) # Roger Koenker UI Bob Hogg and Allen Craig
fit.rq.corn.wheat <- rq(log(corn.wheat) ~ log(wheat.vols), tau = taus, data = r_corr_vol)
fit.lm.corn.wheat <- lm(log(corn.wheat) ~ log(wheat.vols), data = r_corr_vol)
#' Some test statements
corn.wheat.summary <- summary(fit.rq.corn.wheat, se = "boot")
```
Decision
=======================================================================
Background
-----------------------------------------------------------------------
### Overview
Corn, wheat and soybeans are the three major crops that are produced in the United States. In particularly, Midwest has the stable climate to grow the crops. Considering the limited farmland and seasonality, farmers have to think how to fully utilize this land to get the most out of it. Below is the summary of the planting and harvest for each crops. According to www.thebalance.com,
- Corn -- **Plant**: Apr - June -- **Harvest**: Oct - Nov
- Wheat -- **Plant**: Apr - May -- **Harvest**: Aug - Sept
- Soybeans -- **Plant**: Apr - June -- **Harvest**: Sept - Nov
####
###Business questions
1. What is the nature of crops market in general and in particular for this data set?
2. How would the performance of these commodities affect the timing of farming?
3. How would we manage the allocation of existing resources given we have just landed in this new market?
Row
-----------------------------------------------------------------------
###Stylized Facts of the Crops Market
- Volatility is rarely constant and often has a structure (mean reversion) and is dependent on the past.
- Extreme events are likely to happen with other extreme events.
- Negative returns are more likely than positive returns (left skew).
- Corn and wheat have a moderate correlation of 0.63.
- Corn and soybeans have a slight weaker correlated than corn and wheat. The correlation value is 0.58.
- Wheat and soybeans have fair correlation of 0.36.
- There is a positive spillover effect from wheat into corn. In other words, the volatility of the corn market and wheat market are closely related.
- We can sensitize our analysis with the range of upper and lower bounds on the parameter estimates of the relationship between correlation and volatility.
- The log()-log() transformation allows us to interpret the regression coefficients as elasticities, which vary with the quantile. The larger the elasticity, especially if the absolute value is greater than one, the more risk dependence one market has on the other.
Data
=======================================================================
Column
-----------------------------------------------------------------------
### Data Definitions
- **Corn**: daily corn price (\$/per bushel)
- **Wheat**: daily wheat prices (\$/per bushel)
- **Soybeans**: daily soybeans prices (\$/per bushel)
### Crops Price Percent Changes
```{r}
p <- autoplot.zoo(data.xts[,1:3]) # + ggtitle(title.chg1) #+ ylim(-5, 5)
ggplotly(p)
```
Row
-----------------------------------------------------------------------
### Initial Analysis of Corn, Wheat and Soybeans
Historical data 2013-2018
- All crops have experienced a number of spikes in price and magnitude percentage change.
- Factors that affect the corn price: ethanol effect, crude oil prices, investor/speculator effect, climate, Chinese effect, Geoplotical issues
- Factors that affect the wheat price: economy, inflation, production outlook in top wheat producing countries, climate, corn/wheat spread
- Factors that affect the soybeans price: economy, climate, US production, the corn effect, Brazil production, Chinese demand, global production outlook, GMO soybean effect
-Adapted source from http://www.futuresknowledge.com
### Size of Crops Price Percent Changes
```{r}
p <- autoplot.zoo(abs(data.xts[,4:6])) # + ggtitle(title.chg2) #+ ylim(-5, 5)
ggplotly(p)
```
Exploratory Analysis
=======================================================================
Row {.tabset .tabset-fade}
-----------------------------------------------------------------------
###Corn Loss Distribution
```{r}
returns1 <- returns[,1]
colnames(returns1) <- "Returns" #kluge to coerce column name for df
returns1.df <- data.frame(Returns = returns1[,1], Distribution = rep("Historical", each = length(returns1)))
alpha <- 0.95
# Value at Risk
VaR.hist <- quantile(returns1,alpha)
VaR.text <- paste("Value at Risk =", round(VaR.hist, 2))
# Determine the max y value of the desity plot.
# This will be used to place the text above the plot
VaR.y <- max(density(returns1.df$Returns)$y)
# Expected Shortfall
ES.hist <- median(returns1[returns1 > VaR.hist])
ES.text <- paste("Expected Shortfall =", round(ES.hist, 2))
p <- ggplot(returns1.df, aes(x = Returns, fill = Distribution)) + geom_density(alpha = 0.5) +
geom_vline(aes(xintercept = VaR.hist), linetype = "dashed", size = 1, color = "firebrick1") +
geom_vline(aes(xintercept = ES.hist), size = 1, color = "firebrick1") +
annotate("text", x = 2+ VaR.hist, y = VaR.y*1.05, label = VaR.text) +
annotate("text", x = 1.5+ ES.hist, y = VaR.y*1.1, label = ES.text) + scale_fill_manual( values = "dodgerblue4")
p
```
###Wheat Loss Distribution
```{r}
returns1 <- returns[,2]
colnames(returns1) <- "Returns" #kluge to coerce column name for df
returns1.df <- data.frame(Returns = returns1[,1], Distribution = rep("Historical", each = length(returns1)))
alpha <- 0.95
# Value at Risk
VaR.hist <- quantile(returns1,alpha)
VaR.text <- paste("Value at Risk =", round(VaR.hist, 2))
# Determine the max y value of the desity plot.
# This will be used to place the text above the plot
VaR.y <- max(density(returns1.df$Returns)$y)
# Expected Shortfall
ES.hist <- median(returns1[returns1 > VaR.hist])
ES.text <- paste("Expected Shortfall =", round(ES.hist, 2))
p <- ggplot(returns1.df, aes(x = Returns, fill = Distribution)) + geom_density(alpha = 0.5) +
geom_vline(aes(xintercept = VaR.hist), linetype = "dashed", size = 1, color = "firebrick1") +
geom_vline(aes(xintercept = ES.hist), size = 1, color = "firebrick1") +
annotate("text", x = 2+ VaR.hist, y = VaR.y*1.05, label = VaR.text) +
annotate("text", x = 1.5+ ES.hist, y = VaR.y*1.1, label = ES.text, size = 3.5) + scale_fill_manual( values = "dodgerblue4")
p
```
###Soybeans Loss Distribution
```{r}
returns1 <- returns[,3]
colnames(returns1) <- "Returns" #kluge to coerce column name for df
returns1.df <- data.frame(Returns = returns1[,1], Distribution = rep("Historical", each = length(returns1)))
alpha <- 0.95
# Value at Risk
VaR.hist <- quantile(returns1,alpha)
VaR.text <- paste("Value at Risk =", round(VaR.hist, 2))
# Determine the max y value of the desity plot.
# This will be used to place the text above the plot
VaR.y <- max(density(returns1.df$Returns)$y)
# Expected Shortfall
ES.hist <- median(returns1[returns1 > VaR.hist])
ES.text <- paste("Expected Shortfall =", round(ES.hist, 2))
p <- ggplot(returns1.df, aes(x = Returns, fill = Distribution)) + geom_density(alpha = 0.5) +
geom_vline(aes(xintercept = VaR.hist), linetype = "dashed", size = 1, color = "firebrick1") +
geom_vline(aes(xintercept = ES.hist), size = 1, color = "firebrick1") +
annotate("text", x = 2+ VaR.hist, y = VaR.y*1.05, label = VaR.text) +
annotate("text", x = 1.5+ ES.hist, y = VaR.y*1.1, label = ES.text, size = 3) + scale_fill_manual( values = "dodgerblue4")
p
```
### Corn, Wheat, Soybeans
```{r}
p <- autoplot.zoo(ALL.r)
ggplotly(p)
```
### Volatility
```{r}
p <- autoplot.zoo(abs(ALL.r))
ggplotly(p)
```
### Persistence 1
```{r }
acf(coredata(data.xts[,1:3])) # returns
```
### Persistence 2
```{r}
acf(coredata(data.xts[,4:5])) # sizes
```
### Statistics
```{r}
## data_moments function
## INPUTS: r vector
## OUTPUTS: list of scalars (mean, sd, median, skewness, kurtosis)
data_moments <- function(data){
library(moments)
library(matrixStats)
mean.r <- colMeans(data)
median.r <- colMedians(data)
sd.r <- colSds(data)
IQR.r <- colIQRs(data)
skewness.r <- skewness(data)
kurtosis.r <- kurtosis(data)
result <- data.frame(mean = mean.r, median = median.r, std_dev = sd.r, IQR = IQR.r, skewness = skewness.r, kurtosis = kurtosis.r)
return(result)
}
# Run data_moments()
answer <- data_moments(data.xts[, 1:3])
# Build pretty table
answer <- round(answer, 4)
knitr::kable(answer)
```
Market Risk
=======================================================================
Row {.tabset}
-----------------------------------------------------------------------
### Corn and Wheat (90 day rolling correlation)
```{r }
p <- ggplot(r_corr_vol, aes(x = index(r_corr_vol), y = corn.wheat)) + geom_line()
ggplotly(p)
```
### Corn and Soybeans (90 day rolling correlation)
```{r }
p <- ggplot(r_corr_vol, aes(x = index(r_corr_vol), y = corn.soybeans)) + geom_line()
ggplotly(p)
```
### Wheat and Soybeans (90 day rolling correlation)
```{r }
p <- ggplot(r_corr_vol, aes(x = index(r_corr_vol), y = wheat.soybeans)) + geom_line()
ggplotly(p)
```
###Correlation drill-down
```{r}
#library(psych)
ALL_df <- data.frame(corn=ALL.r[,1], wheat=ALL.r[,2], soybeans=ALL.r[,3])
pairs.panels(ALL_df)
```
row {.tabset }
-----------------------------------------------------------------------
### Market Leverage: Corn volatility
```{r}
library(magick)
img <- image_graph(res = 96)
datalist <- split(r_corr_vol, r_corr_vol$year)
out <- lapply(datalist, function(data){
p <- ggplot(data, aes(corn.vols, corn)) +
geom_point() +
ggtitle(data$year) +
geom_quantile(quantiles = c(0.05, 0.95)) +
geom_quantile(quantiles = 0.5, linetype = "longdash") +
geom_density_2d(colour = "red")
print(p)
})
dev.off()
#img <- image_background(image_trim(img), 'white')
animation <- image_animate(img, fps = .5)
animation
```
### Market Spillover: Wheat into Corn
```{r}
library(magick)
img <- image_graph(res = 96)
datalist <- split(r_corr_vol, r_corr_vol$year)
out <- lapply(datalist, function(data){
p <- ggplot(data, aes(wheat.vols, corn.wheat)) +
geom_point() +
ggtitle(data$year) +
geom_quantile(quantiles = c(0.05, 0.95)) +
geom_quantile(quantiles = 0.5, linetype = "longdash") +
geom_density_2d(colour = "red")
print(p)
})
dev.off()
#img <- image_background(image_trim(img), 'white')
animation <- image_animate(img, fps = .5)
animation
```
### Corn Dependency on Wheat
```{r}
plot(summary(fit.rq.corn.wheat), parm = "log(wheat.vols)", main = "Corn-Wheat correlation sensitivity to Wheat volatility", xlab = "quantiles", ylab = "sensitivity")
```
Optimization
=======================================================================
row {.tabset}
----------------------------------------------------------------------
```{r, include=FALSE}
library(quantreg)
x <- data.r/100
n <- nrow(x)
p <- ncol(x)
alpha <- c(0.1, 0.3) # quantiles
w <- c(0.3, 0.7) # distortion weights
lambda <- 100 # Lagrange multiplier for adding up constraint
m <- length(alpha)
# error handling: if (length(w) != m) stop("length of w doesn't match length of alpha")
xbar <- apply(x, 2, mean)
mu.0 <- mean(xbar)
y <- x[, 1] #set numeraire
r <- c(lambda * (xbar[1] - mu.0), -lambda * (xbar[1] - mu.0))
X <- x[, 1] - x[, -1]
R <- rbind(lambda * (xbar[1] - xbar[-1]), -lambda * (xbar[1] - xbar[-1]))
R <- cbind(matrix(0, nrow(R), m), R)
f <- rq.fit.hogg(X, y, taus = alpha, weights = w, R = R, r = r)
fit <- f$coefficients
# transform regression coeff to portfolio weights
pihat <- c(1 - sum(fit[-(1:m)]), fit[-(1:m)])
x <- as.matrix(x)
yhat <- x %*% pihat # predicted
etahat <- quantile(yhat, alpha)
muhat <- mean(yhat)
qrisk <- 0
for (i in 1:length(alpha)) qrisk <- qrisk + w[i] * sum(yhat[yhat < etahat[i]])/(n * alpha[i])
qrisk
pihat
```
```{r, include=FALSE}
library(quantreg)
#library(dplyr) # use data.df now
alpha <- 0.95
u <- quantile(data.df$returns.corn, alpha )
x <- data.df.nd[data.df.nd$returns.corn < u, 2:4]/100
n <- nrow(x)
p <- ncol(x)
alpha <- c(0.01, 0.1) # quantiles at lower (negative) tail
w <- c(0.95, 0.05) # distortion weights
lambda <- 100 # Lagrange multiplier for adding up constraint
m <- length(alpha) #alpha and w length must be the same
xbar <- apply(x, 2, mean)
mu.0 <- mean(xbar)
y <- x[, 1] #set numeraire
r <- c(lambda * (xbar[1] - mu.0), -lambda * (xbar[1] - mu.0))
X <- x[, 1] - x[, -1] # set up design matrix of adjusted all but numeraire returns
R <- rbind(lambda * (xbar[1] - xbar[-1]), -lambda * (xbar[1] - xbar[-1])) # constraints
R <- cbind(matrix(0, nrow(R), m), R) #augmented constraints
f <- rq.fit.hogg(X, y, taus = alpha, weights = w, R = R, r = r) #Bob Hogg estimator
fit <- f$coefficients
# transform regression coeff to portfolio weights
pihat <- c(1 - sum(fit[-(1:m)]), fit[-(1:m)])
x <- as.matrix(x)
yhat <- x %*% pihat # predicted
(etahat <- quantile(yhat, alpha))
(muhat <- mean(yhat))
qrisk <- 0
for (i in 1:length(alpha)) qrisk <- qrisk + w[i] * sum(yhat[yhat < etahat[i]])/(n * alpha[i])
qrisk
pihat
```
### Extreme frontier finance
```{r }
mu.0 <- xbar
mu.P <- seq(-.0005, 0.0015, length = 100) ## set of 300 possible target portfolio returns
qrisk.P <- mu.P ## set up storage for quantile risks of portfolio returns
weights <- matrix(0, nrow=300, ncol = ncol(data.r)) ## storage for portfolio weights
colnames(weights) <- names(data.r)
for (i in 1:length(mu.P))
{
mu.0 <- mu.P[i] ## target returns
result <- qrisk(x, mu = mu.0)
qrisk.P[i] <- -result$qrisk # convert to loss risk already weighted across alphas
weights[i,] <- result$pihat
}
qrisk.mu.df <- data.frame(qrisk.P = qrisk.P, mu.P = mu.P )
mu.P <- qrisk.mu.df$mu.P
mu.free <- 0.0003 ## input value of risk-free interest rate
sharpe <- ( mu.P-mu.free)/qrisk.P ## compute Sharpe's ratios
ind <- (sharpe == max(sharpe)) ## Find maximum Sharpe's ratio
ind2 <- (qrisk.P == min(qrisk.P)) ## find the minimum variance portfolio
ind3 <- (mu.P > mu.P[ind2]) ## find the efficient frontier (blue)
col.P <- ifelse(mu.P > mu.P[ind2], "blue", "grey")
weights.extr <- weights[ind,] # for use in calculating tengency risk measures
qrisk.mu.df$col.P <- col.P
p <- ggplot(qrisk.mu.df, aes(x = qrisk.P, y = mu.P, group = 1)) + geom_line(aes(colour= col.P, group = col.P)) + scale_colour_identity()
p <- p + geom_point(aes(x = 0, y = mu.free), colour = "red")
options(digits=3)
p <- p + geom_abline(intercept = mu.free, slope = (mu.P[ind]-mu.free)/qrisk.P[ind], colour = "red")
p <- p + geom_point(aes(x = qrisk.P[ind], y = mu.P[ind]))
p <- p + geom_point(aes(x = qrisk.P[ind2], y = mu.P[ind2]))
ggplotly(p)
```
### Extreme portfolio risk measures
```{r }
price.last <- as.numeric(tail(prices[, 2:4], n=1))
# Specify the positions
position.rf <- weights.extr #c(1/3, 1/3, 1/3)
# And compute the position weights
w <- position.rf * price.last
# Fan these the length and breadth of the risk factor series
weights.rf <- matrix(w, nrow=nrow(data.r), ncol=ncol(data.r), byrow=TRUE)
#head(rowSums((exp(data.r/100)-1)*weights.rf), n=3)
## We need to compute exp(x) - 1 for very small x: expm1 accomplishes this
#head(rowSums((exp(data.r/100)-1)*weights.rf), n=4)
loss.rf <- -rowSums(expm1(data.r/100) * weights.rf)
alpha.tolerance <- 0.95
u <- quantile(loss.rf, alpha.tolerance, names = FALSE)
fit.extr <- fit.GPD(loss.rf, threshold = u)
showRM(fit.extr, alpha = alpha.tolerance, RM = "ES", method = "BFGS")
```
### Markowitz efficient portfolio frontier
```{r }
R <- returns[,1:3]/100
names.R <- colnames(R)
mean.R <- apply(R,2,mean)
cov.R <- cov(R)
sd.R <- sqrt(diag(cov.R)) ## remember these are in daily percentages
#library(quadprog)
Amat <- cbind(rep(1,3),mean.R) ## set the equality constraints matrix
mu.P <- seq(-.0005, 0.0015, length = 100) ## set of 300 possible target portfolio returns
sigma.P <- mu.P ## set up storage for std dev's of portfolio returns
weights <- matrix(0, nrow=300, ncol = ncol(R)) ## storage for portfolio weights
colnames(weights) <- names.R
for (i in 1:length(mu.P))
{
bvec <- c(1,mu.P[i]) ## constraint vector
result <- solve.QP(Dmat=2*cov.R,dvec=rep(0,3),Amat=Amat,bvec=bvec,meq=2)
sigma.P[i] <- sqrt(result$value)
weights[i,] <- result$solution
}
sigma.mu.df <- data.frame(sigma.P = sigma.P, mu.P = mu.P )
mu.free <- .0003 ## input value of risk-free interest rate
sharpe <- ( mu.P-mu.free)/sigma.P ## compute Sharpe's ratios
ind <- (sharpe == max(sharpe)) ## Find maximum Sharpe's ratio
ind2 <- (sigma.P == min(sigma.P)) ## find the minimum variance portfolio
ind3 <- (mu.P > mu.P[ind2]) ## finally the efficient frontier
col.P <- ifelse(mu.P > mu.P[ind2], "blue", "grey")
sigma.mu.df$col.P <- col.P
p <- ggplot(sigma.mu.df, aes(x = sigma.P, y = mu.P, group = 1)) + geom_line(aes(colour=col.P, group = col.P)) + scale_colour_identity() # + xlim(0, max(sd.R*1.1)) + ylim(0, max(mean.R)*1.1) +
p <- p + geom_point(aes(x = 0, y = mu.free), colour = "red")
options(digits=4)
p <- p + geom_abline(intercept = mu.free, slope = (mu.P[ind]-mu.free)/sigma.P[ind], colour = "red")
p <- p + geom_point(aes(x = sigma.P[ind], y = mu.P[ind]))
p <- p + geom_point(aes(x = sigma.P[ind2], y = mu.P[ind2])) ## show min var portfolio
p <- p + annotate("text", x = sd.R[1], y = mean.R[1], label = names.R[1]) + annotate("text", x = sd.R[2], y = mean.R[2], label = names.R[2]) + annotate("text", x = sd.R[3], y = mean.R[3], label = names.R[3])
ggplotly(p)
```
### Portfolio Weights
The weights for the Markowitz tangency portfolio ("*") are
```{r echo = FALSE}
library(scales)
weights[ind2,][1,]
name <- colnames(weights)
posn <- ifelse((weights[ind,]<0), "go short (sell)", "go long, (buy)")
value <- percent(abs(weights[ind,]))
```
The weights for the QR tangency portfolio are
```{r echo = FALSE}
library(scales)
weights.extr[1,]
```
For the working capital accounts:
1. 100 million overall portfolio
2. Buy 31 million in corn
3. Buy 17 million in wheat
4. Buy 52 million in soybeans
###Risk Accessment
We need to know how much cash are needed to support the risk tolerance and threshold. To accomplish that, we set that the organization's tolerance for risk at 5%: that is, a maximum of 5% of the time could losses exceed 10%. The organization also sets the collateral at 2% and a 25% standard deviation. Then we need to solve w in the equation:
$$
z = \frac{- 0.1 - 0.1w - 0.02 ( 1 - w )}{0.25w}
$$
or
$$
NormalInverse [ Normal ( \frac{- 0.12 - 0.12w}{0.25w} )] = NormalInverse(0.05)
$$
We calculate NormalInverse(0.05) by performing
```{r, echo=TRUE}
qnorm(0.05)
```
then, to solve w:
```{r, echo=TRUE}
risky <- -0.12/(0.25*(-1.64)+0.12)
paste('The weight invested in the risky contract', risky)
```
Therefore, the risky contract value = 42% of the portfolio value.
Loss measurement
=======================================================================
### Empirical loss
```{r}
# Get last prices
price.last <- as.numeric(tail(prices[, 2:4], n=1))
# Specify the positions
position.rf <- c(1/3, 1/3, 1/3)
# And compute the position weights
w <- position.rf * price.last
# Fan these the length and breadth of the risk factor series
weights.rf <- matrix(w, nrow=nrow(data.r), ncol=ncol(data.r), byrow=TRUE)
#head(rowSums((exp(data.r/100)-1)*weights.rf), n=3)
## We need to compute exp(x) - 1 for very small x: expm1 accomplishes this
#head(rowSums((exp(data.r/100)-1)*weights.rf), n=4)
loss.rf <- -rowSums(expm1(data.r/100) * weights.rf)
loss.rf.df <- data.frame(Loss = loss.rf, Distribution = rep("Historical", each = length(loss.rf)))
## Simple Value at Risk and Expected Shortfall
alpha.tolerance <- .95
VaR.hist <- quantile(loss.rf, probs=alpha.tolerance, names=FALSE)
## Just as simple Expected shortfall
ES.hist <- median(loss.rf[loss.rf > VaR.hist])
VaR.text <- paste("Value at Risk =\n", round(VaR.hist, 2)) # ="VaR"&c12
ES.text <- paste("Expected Shortfall \n=", round(ES.hist, 2))
title.text <- paste(round(alpha.tolerance*100, 0), "% Loss Limits")
p <- ggplot(loss.rf.df, aes(x = Loss, fill = Distribution)) + geom_histogram(alpha = 0.8) + geom_vline(aes(xintercept = VaR.hist), linetype = "dashed", size = 1, color = "blue") + geom_vline(aes(xintercept = ES.hist), size = 1, color = "blue") + annotate("text", x = VaR.hist, y = 40, label = VaR.text) + annotate("text", x = ES.hist, y = 20, label = ES.text) + xlim(0, 0.4) + ggtitle(title.text)
ggplotly(p)
```
Extremes
=======================================================================
Row {.tabset}
-----------------------------------------------------------------------
### Mean excess loss
```{r}
data <- as.vector(loss.rf) # data is purely numeric
umin <- min(data) # threshold u min
umax <- max(data) - 0.1 # threshold u max
nint <- 100 # grid length to generate mean excess plot
grid.0 <- numeric(nint) # grid store
e <- grid.0 # store mean exceedances e
upper <- grid.0 # store upper confidence interval
lower <- grid.0 # store lower confidence interval
u <- seq(umin, umax, length = nint) # threshold u grid
alpha <- 0.95 # confidence level
for (i in 1:nint) {
data <- data[data > u[i]] # subset data above thresholds
e[i] <- mean(data - u[i]) # calculate mean excess of threshold
sdev <- sqrt(var(data)) # standard deviation
n <- length(data) # sample size of subsetted data above thresholds
upper[i] <- e[i] + (qnorm((1 + alpha)/2) * sdev)/sqrt(n) # upper confidence interval
lower[i] <- e[i] - (qnorm((1 + alpha)/2) * sdev)/sqrt(n) # lower confidence interval
}
mep.df <- data.frame(threshold = u, threshold.exceedances = e, lower = lower, upper = upper)
loss.excess <- loss.rf[loss.rf > u]
p <- ggplot(mep.df, aes( x= threshold, y = threshold.exceedances)) + geom_line() + geom_line(aes(x = threshold, y = lower), colour = "red") + geom_line(aes(x = threshold, y = upper), colour = "red") + annotate("text", x = 0.1, y = 0.07, label = "upper 95%") + annotate("text", x = 0.1, y = 0.01, label = "lower 5%")
ggplotly(p)
```
####
The upper red line and the lower red line forms a boundary to tell us what is happening to our exceedences. The middle, black line is the shape of parameter at various threshold and their corresponding exceedences.
### GPD fits and starts
```{r}
data <- as.vector(loss.rf) # data is purely numeric
umin <- min(data) # threshold u min
umax <- max(data) - 0.1 # threshold u max
nint <- 100 # grid length to generate mean excess plot
grid.0 <- numeric(nint) # grid store
e <- grid.0 # store mean exceedances e
upper <- grid.0 # store upper confidence interval
lower <- grid.0 # store lower confidence interval
u <- seq(umin, umax, length = nint) # threshold u grid
alpha <- 0.95 # confidence level
for (i in 1:nint) {
data <- data[data > u[i]] # subset data above thresholds
e[i] <- mean(data - u[i]) # calculate mean excess of threshold
sdev <- sqrt(var(data)) # standard deviation
n <- length(data) # sample size of subsetted data above thresholds
upper[i] <- e[i] + (qnorm((1 + alpha)/2) * sdev)/sqrt(n) # upper confidence interval
lower[i] <- e[i] - (qnorm((1 + alpha)/2) * sdev)/sqrt(n) # lower confidence interval
}
mep.df <- data.frame(threshold = u, threshold.exceedances = e, lower = lower, upper = upper)
loss.excess <- loss.rf[loss.rf > u]
#library(QRM)
alpha.tolerance <- 0.95
u <- quantile(loss.rf, alpha.tolerance , names=FALSE)
fit <- fit.GPD(loss.rf, threshold=u) # Fit GPD to the excesses
xi.hat <- fit$par.ests[["xi"]] # fitted xi
beta.hat <- fit$par.ests[["beta"]] # fitted beta
data <- loss.rf
n.relative.excess <- length(loss.excess) / length(loss.rf) # = N_u/n
VaR.gpd <- u + (beta.hat/xi.hat)*(((1-alpha.tolerance) / n.relative.excess)^(-xi.hat)-1)
ES.gpd <- (VaR.gpd + beta.hat-xi.hat*u) / (1-xi.hat)
n.relative.excess <- length(loss.excess) / length(loss.rf) # = N_u/n
VaR.gpd <- u + (beta.hat/xi.hat)*(((1-alpha.tolerance) / n.relative.excess)^(-xi.hat)-1)
ES.gpd <- (VaR.gpd + beta.hat-xi.hat*u) / (1-xi.hat)
# Plot away
VaRgpd.text <- paste("GPD: Value at Risk =", round(VaR.gpd, 2))
ESgpd.text <- paste("Expected Shortfall =", round(ES.gpd, 2))
title.text <- paste(VaRgpd.text, ESgpd.text, sep = " ")
loss.plot <- ggplot(loss.rf.df, aes(x = Loss, fill = Distribution)) + geom_density(alpha = 0.2)
loss.plot <- loss.plot + geom_vline(aes(xintercept = VaR.gpd), colour = "blue", linetype = "dashed", size = 0.8)
loss.plot <- loss.plot + geom_vline(aes(xintercept = ES.gpd), colour = "blue", size = 0.8)
#+ annotate("text", x = 300, y = 0.0075, label = VaRgpd.text, colour = "blue") + annotate("text", x = 300, y = 0.005, label = ESgpd.text, colour = "blue")
loss.plot <- loss.plot + xlim(0,0.4) + ggtitle(title.text)
ggplotly(loss.plot)
```
### Confidence and risk measures
```{r}
showRM(fit, alpha = 0.95, RM = "ES", method = "BFGS")
```
Insights
=======================================================================
row {.tabset}
-------------------------------------------------------------------------
### Key Insights
1. All crops have experienced spikes in price and magnitude percentage change. Common shared factors are climate and competitors. Corn vs wheat and corn vs soybeans have moderate positive correlation (corn vs wheat is 0.63; corn vs soybeans is 0.58.
2. The value at risk of the crops mix would be matched to the market performance ergo value of the commodities. The timing of the farming directly correlates to the proportionate projected volatility of the crops mix. It shows that certain mixes of crops and will provide us with less volatility and more value for our risk associated. Corn, wheat and soybeans all have a very low mean. They also have smaller standard deviations which proves to be more consistent and at less risk when developing what will be necessary to achieve optimal size and mix.
3. The working capital account of \$100 million should be allocated as follows: buy \$31 million in corns, \$17 million in wheat, $52 million in soybeans.
Row
------------------------------------------------------------------------
### References
**R Packages used**
ggplot, scales, quadprog, quantreg, shiny, flexdashboard, qrmdata, xts, matrixStats, zoo, QRM, plotly, and psych
**References**
- Github and Stack Overflow for trouble shooting
- Brian Peirce for the help in brainstorming business questions
- www.thebalance.com
- www.macrotrend.net
- https://turing.manhattan.edu/~wfoote01/finalytics/_site/index.html
- https://bookdown.org/wfoote01/faur/
- http://www.futuresknowledge.com